
An Object-Oriented Regression for Building Disease Predictive Models with Multiallelic HLA Genes



Abstract

Recent genome-wide association studies confirm that human leukocyte antigen (HLA) genes have the strongest associations with several autoimmune diseases, including type 1 diabetes (T1D), providing an impetus to reduce this genetic association to practice through an HLA-based disease predictive model. However, conventional model-building methods tend to be suboptimal when predictors are highly polymorphic, with many rare alleles and complex patterns of sequence homology within and between genes. To circumvent this challenge, we describe an alternative methodology: treating complex genotypes of HLA genes as "objects" or "exemplars," one focuses on systemic associations of disease phenotype with "objects" via similarity measurements. Conceptually, this approach assigns disease risks based on complex genotype profiles instead of specific disease-associated genotypes or alleles. Effectively, it transforms large, discrete, and sparse HLA genotypes into a matrix of similarity-based covariates. Drawing on the kernel representer theorem and machine learning techniques, it uses a penalized likelihood method to select disease-associated exemplars in building predictive models. To illustrate this methodology, we apply it to a T1D study with eight HLA genes (HLA-DRB1, HLA-DRB3, HLA-DRB4, HLA-DRB5, HLA-DQA1, HLA-DQB1, HLA-DPA1, and HLA-DPB1) to build a predictive model. The resulting predictive model has an area under the curve (AUC) of 0.92 in the training set and 0.89 in the validation set, indicating that this methodology is useful for building predictive models with complex HLA genotypes.
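The two steps described in the abstract (turning genotypes into similarity-based covariates, then selecting exemplars with a penalized likelihood) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: the similarity measure (fraction of alleles shared per gene, averaged over genes), the use of every subject as a candidate exemplar, the L1 penalty fitted by proximal gradient descent, the placeholder allele labels, and all function names.

```python
import math

def gene_similarity(pair_a, pair_b):
    """Fraction of alleles shared at one gene (0, 0.5, or 1), ignoring order."""
    remaining = list(pair_b)
    shared = 0
    for allele in pair_a:
        if allele in remaining:
            remaining.remove(allele)
            shared += 1
    return shared / 2.0

def profile_similarity(subject, exemplar, genes):
    """Average per-gene similarity between two multi-gene genotype profiles."""
    return sum(gene_similarity(subject[g], exemplar[g]) for g in genes) / len(genes)

def similarity_matrix(subjects, exemplars, genes):
    """One row per subject, one similarity-based covariate per exemplar."""
    return [[profile_similarity(s, e, genes) for e in exemplars] for s in subjects]

def soft_threshold(value, amount):
    """Shrink a coefficient toward zero; small coefficients become exactly zero."""
    return math.copysign(max(abs(value) - amount, 0.0), value)

def fit_l1_logistic(X, y, penalty=0.05, step=0.5, iterations=500):
    """L1-penalized logistic regression via proximal gradient descent.

    Exemplars whose weight ends at zero are dropped from the model.
    The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    intercept, weights = 0.0, [0.0] * p
    for _ in range(iterations):
        grad_b, grad_w = 0.0, [0.0] * p
        for row, label in zip(X, y):
            z = intercept + sum(w * x for w, x in zip(weights, row))
            residual = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += residual / n
            for j, x in enumerate(row):
                grad_w[j] += residual * x / n
        intercept -= step * grad_b
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return intercept, weights

# Toy data with placeholder allele labels (hypothetical, not real study data).
genes = ["gene1", "gene2"]
subjects = [
    {"gene1": ("A1", "A2"), "gene2": ("B1", "B2")},
    {"gene1": ("A1", "A1"), "gene2": ("B1", "B1")},
    {"gene1": ("A3", "A4"), "gene2": ("B3", "B4")},
    {"gene1": ("A3", "A3"), "gene2": ("B3", "B2")},
]
labels = [1, 1, 0, 0]   # 1 = case, 0 = control (toy labels)
exemplars = subjects    # assumption: every observed profile is a candidate exemplar

X = similarity_matrix(subjects, exemplars, genes)
intercept, weights = fit_l1_logistic(X, labels)
selected = [i for i, w in enumerate(weights) if w != 0.0]
```

In this sketch, `X` replaces the sparse allele indicators with dense covariates in the range 0 to 1, and `selected` lists the exemplars the penalty retained. Raising `penalty` keeps fewer exemplars; a real analysis would choose it by cross-validation and would likely use a similarity measure that also reflects sequence homology between alleles.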
